Ranking uncertain sky: The probabilistic top-k skyline operator
Abstract
Many recent applications involve processing and analyzing uncertain data. In this paper, we combine the feature of top-k objects with that of skyline to model the problem of top-k skyline objects against uncertain data. The problem of efficiently computing top-k skyline objects on large uncertain datasets is challenging in both discrete and continuous cases. An efficient exact algorithm for computing the top-k skyline objects is developed for discrete cases. To address applications where each object may have a massive set of instances or a continuous probability density function, we also develop an efficient randomized algorithm with an ε-approximation guarantee. Moreover, our algorithms can be immediately extended to efficiently compute the p-skyline; that is, to retrieve the uncertain objects with skyline probabilities above a given threshold. Our extensive experiments on synthetic and real data demonstrate that both algorithms are efficient and that the randomized algorithm is highly accurate. They also show that our techniques significantly outperform the existing techniques for computing the p-skyline. © 2011 Elsevier B.V. All rights reserved.
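To make the quantities named in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the paper's algorithm. It assumes the usual discrete model in which each uncertain object is a set of weighted instances, objects are independent of one another, and smaller values are preferred on every dimension. All names (`UncertainData`, `skyline_probability`, `top_k_skyline`, `p_skyline`, `sampled_skyline_probability`) are hypothetical. The last function is a naive Monte Carlo estimate over sampled possible worlds, included only to illustrate the idea of a randomized approximation; it carries no ε-approximation guarantee.

```python
import random
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]
# Hypothetical representation: object name -> list of (instance, probability).
# The probabilities of one object's instances are assumed to sum to 1.
UncertainData = Dict[str, List[Tuple[Point, float]]]


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """u dominates v: no worse on every dimension, strictly better on one.
    Assumption: smaller values are preferred."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def skyline_probability(data: UncertainData, name: str) -> float:
    """Probability that no other object dominates this one (objects independent)."""
    total = 0.0
    for u, p_u in data[name]:
        not_dominated = 1.0
        for other, instances in data.items():
            if other == name:
                continue
            # Probability that `other` takes an instance that dominates u.
            p_dominated = sum(p for v, p in instances if dominates(v, u))
            not_dominated *= 1.0 - p_dominated
        total += p_u * not_dominated
    return total


def top_k_skyline(data: UncertainData, k: int) -> List[Tuple[str, float]]:
    """The k objects with the highest skyline probabilities (brute force)."""
    scored = [(name, skyline_probability(data, name)) for name in data]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def p_skyline(data: UncertainData, threshold: float) -> List[str]:
    """Objects whose skyline probability reaches the threshold.
    Assumption: 'above' is read as 'at least'."""
    return [name for name in data if skyline_probability(data, name) >= threshold]


def sampled_skyline_probability(
    data: UncertainData, name: str, samples: int, rng: random.Random
) -> float:
    """Naive Monte Carlo estimate: draw one instance per object, then check
    whether the chosen instance of `name` is dominated in that sampled world."""
    hits = 0
    for _ in range(samples):
        world = {
            obj: rng.choices([u for u, _ in inst], weights=[p for _, p in inst])[0]
            for obj, inst in data.items()
        }
        u = world[name]
        if not any(dominates(v, u) for obj, v in world.items() if obj != name):
            hits += 1
    return hits / samples


if __name__ == "__main__":
    example: UncertainData = {
        "A": [((1.0, 4.0), 0.5), ((3.0, 3.0), 0.5)],
        "B": [((2.0, 2.0), 1.0)],
        "C": [((4.0, 1.0), 0.4), ((5.0, 5.0), 0.6)],
    }
    print(top_k_skyline(example, 2))
    print(p_skyline(example, 0.45))
    print(sampled_skyline_probability(example, "C", 5000, random.Random(0)))
```

The exact computation compares every instance with every instance of every other object, so its cost grows quadratically with the total number of instances. That cost is the kind the abstract's efficient algorithms are meant to avoid on large datasets.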
Related resources
Link-based Ranking of Skyline Result Sets
Skyline query processing has received considerable attention in the recent past. Mainly, the skyline query is used to find a set of non-dominated data points in a multi-dimensional dataset. One of the major drawbacks of the skyline operator is the high cardinality of the result set. Providing the most interesting points of the skyline set (top-k) inherently involves the ranking of the skyline p...
PhD Thesis: Efficiently and Effectively Processing Probabilistic Queries on Uncertain Data
Uncertainty is inherent in many real applications. Uncertain data analysis and query processing has become a critical issue and has recently attracted a great deal of attention in the database research community. The thesis, therefore, targets an important and challenging topic: uncertain data management. It is a high-quality and well-written PhD thesis. Five important and related aspects of uncerta...
Getting the Best from Uncertain Data
The skyline of a relation is the set of tuples that are not dominated by any other tuple in the same relation, where tuple u dominates tuple v if u is no worse than v on all the attributes of interest and strictly better on at least one attribute. Previous attempts to extend skyline queries to probabilistic databases have proposed either a weaker form of domination, which is unsuitable to univo...
Probabilistic Skylines on Uncertain Data
Uncertain data are inherent in some important applications. Although a considerable amount of research has been dedicated to modeling uncertain data and answering some types of queries on uncertain data, how to conduct advanced analysis on uncertain data remains an open problem at large. In this paper, we tackle the problem of skyline analysis on uncertain data. We propose a novel probabilistic...
Top-k best probability queries and semantics ranking properties on probabilistic databases
There has been much interest in answering top-k queries on probabilistic data in various applications such as market analysis, personalised services, and decision making. In probabilistic relational databases, the most common problem in answering top-k queries (ranking queries) is selecting the top-k result based on scores and top-k probabilities. In this paper, we firstly propose novel answers...
Journal title:
- Inf. Syst.
Volume 36
Publication year 2011